World of Education

home *** CD-ROM | disk | FTP | other *** search

/ World of Education / World of Education.iso / world_g / gaw110.zip / GAW.FX < prev next >

Wrap

Text File | 1992-06-27 | 67KB | 1,662 lines

M _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m _W_o_r_k_b_e_n_c_h _D_o_c_u_m_e_n_t_a_t_i_o_n J _T_h_e _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m _W_o_r_k_b_e_n_c_h _p_r_o_g_r_a_m _a_n_d _i_t_s _d_o_c_u_m_e_n_t_a_t_i_o_n _a_r_e _c_o_p_y_- _r_i_g_h_t _a_n_d _m_a_y _n_o_t _b_e _c_o_p_i_e_d _o_r _d_i_s_t_r_i_b_u_t_e_d _w_i_t_h_o_u_t _w_r_i_t_t_e_n _p_e_r_m_i_s_s_i_o_n _f_r_o_m _t_h_e _a_u_t_h_o_r _w_i_t_h _t_h_e _f_o_l_l_o_w_i_n_g _e_x_c_e_p_t_i_o_n_s: J (_1) _C_o_p_i_e_s _o_f _t_h_e _s_o_f_t_w_a_r_e _a_n_d _t_h_i_s _d_o_c_u_m_e_n_t_a_t_i_o_n _m_a_y _b_e _m_a_d_e _a_n_d _p_a_s_s_e_d _o_n _t_o _a_n_y _t_h_i_r_d _p_a_r_t_y _p_r_o_v_i_d_e_d _t_h_a_t _a_l_l _t_h_e _f_i_l_e_s _o_n _t_h_e _d_i_s_t_r_i_b_u_t_i_o_n _d_i_s_k _a_r_e _d_i_s_t_r_i_b_u_t_e_d _t_o_g_e_t_h_e_r _i_n _u_n_m_o_d_i_f_i_e_d _f_o_r_m, _a_n_d _p_r_o_v_i_d_i_n_g _t_h_a_t _n_o _p_r_o_f_i_t _i_s _m_a_d_e _f_r_o_m _s_u_c_h _d_i_s_t_r_i_b_u_t_i_o_n. J (_2) _A _r_e_a_s_o_n_a_b_l_e _n_u_m_b_e_r _o_f _c_o_p_i_e_s _m_a_y _b_e _m_a_d_e _o_f _t_h_e _s_o_f_t_w_a_r_e _f_o_r _t_h_e _p_u_r_p_o_s_e _o_f _a_r_c_h_i_v_i_n_g _t_o _g_u_a_r_d _a_g_a_i_n_s_t _c_o_r_r_u_p_t_i_o_n _o_f _t_h_e _w_o_r_k_i_n_g _c_o_p_y _o_f _t_h_e _s_o_f_t_w_a_r_e. J _T_h_e _s_o_f_t_w_a_r_e _c_a_n _b_e _u_s_e_d _w_i_t_h_o_u_t _r_e_s_t_r_i_c_t_i_o_n _o_r _p_a_y_m_e_n_t, _b_u_t _y_o_u _a_r_e _e_n_c_o_u_r_a_g_e_d _t_o _s_e_n_d _a_n _a_p_p_r_o_p_r_i_a_t_e _c_o_n_t_r_i_b_u_t_i_o_n _i_n _s_t_e_r_l_i_n_g _t_o _t_h_e _a_u_t_h_o_r _i_f _y_o_u _f_e_e_l _t_h_a_t _t_h_e _p_r_o_g_r_a_m _h_a_s _b_e_e_n _o_f _u_s_e. _S_e_e _s_e_c_t_i_o_n _4 _f_o_r _t_h_e _a_u_t_h_o_r'_s _a_d_d_r_e_s_s. J _N_o _w_a_r_r_a_n_t_y _i_s _g_i_v_e_n _t_h_a_t _t_h_i_s _s_o_f_t_w_a_r_e _i_s _f_i_t _f_o_r _a_n_y _p_u_r_p_o_s_e, _n_o_r _t_h_a_t _i_t _w_i_l_l _p_e_r_f_o_r_m _a_s _d_e_s_c_r_i_b_e_d _i_n _t_h_i_s _m_a_n_u_a_l. _Y_o_u _u_s_e _i_t _e_n_t_i_r_e_l_y _a_t _y_o_u_r _o_w_n _r_i_s_k. J _C_o_p_y_r_i_g_h_t (_C) _M_a_r_k _H_u_g_h_e_s _1_9_8_9. _A_l_l _r_i_g_h_t_s _r_e_s_e_r_v_e_d. _L_a_s_t _C_h_a_n_g_e: _3_0 _N_o_v_e_m_b_e_r _1_9_8_9. J _C_O_N_T_E_N_T_S J 1. Introduction 2. Purpose 3. Using the Genetic Algorithm Workbench Program 3.1 Hardware Requirements 3.2 Running the Program 3.3 Screen Display 3.4 Menu Commands 3.5 Program Control Variables 4. Bibliography 5. Appendix A - Workbench Algorithms 5.1 Solving Problems with a Genetic Algorithm 5.2 Genetic Coding 5.3 Genetic Algorithm Implementation 5.4 Summary of Algorithm Input Variables 5.4.1 Population 5.4.2 Fitness Scaling 5.4.3 Breeder Selection 5.4.4 Generation Gap 5.4.5 Mates Selection 5.4.6 Mating 5.4.7 Crossover 5.4.8 Mutation Probability 5.4.9 Dispersal 5.4.10 Crowding Factor 5.4.11 Elitism 5.4.12 Sacrifice Selection 5.5 Output Variables 5.6 References 6. Appendix B - A Typical Session Using the Workbench 7. Appendix C - Problems and How to Fix Them 8. Appendix D - General Introduction to Genetic Algorithms 9. Appendix E - Main Command Menu J G1 H_1. _I_n_t_r_o_d_u_c_t_i_o_n J This is a user's manual for the Genetic Algorithm Workbench program which is an interactive tool for demonstrating and experimenting with genetic algorithms using an IBM compatible personal computer. This is not a set of subroutines for inclusion in your own programs. If that is what you require, see the bibliography for details of GENESIS. J The manual commences with a description of the purpose of the Work- bench in section 2. J Section 3 then describes how to use the program and gives details of hardware requirements, instructions for running the program, an ex- planation of the screen display, and explains how to control the program using the command menu and program control variables. J Section 4 contains a short bibliography. J Appendix A describes the detailed operation of the genetic algorithm employed by the Workbench and the effect of each algorithm input variable. It gives details of where to find further information about the theory of the different aspects of the genetic algorithm which are described. An explanation of each output variable displayed on the screen is also given. J Appendix B contains a short step by step example of using the Work- bench. J Appendix C may help if you encounter problems trying to use the Workbench. J Appendix D includes an article as a general introduction to genetic algorithms and their applications. J Appendix E is a brief summary of the Workbench command menu. _2. _P_u_r_p_o_s_e J The purpose of the Workbench program is to allow experimentation with search and optimisation algorithms. It is primarily a tool for experimenting with different types of genetic algorithm, but is also intended for use in comparing genetic algorithms with other tech- niques although so far only the genetic algorithm has been imple- mented. G2 HA genetic algorithm is a method for finding the peaks of difficult functions, and the Workbench program is both for demonstrating genetic algorithms and for evaluating their performance. J The idea is that you provide a function, the "Target Function" and see how quickly the selected algorithm is able to find the peak value, or indeed if it succeeds at all. You can vary the details of the algorithm used by tweaking several numeric control parameters and selecting different types of operator employed by the algorithm. _3. _U_s_i_n_g _t_h_e _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m _W_o_r_k_b_e_n_c_h _P_r_o_g_r_a_m J The following sections describe hardware required and provide in- structions for starting the Workbench program. The screen display is then explained followed by details of how to control the program, and finally a description of menu commands is given. J Appendix B describes a typical session using the Workbench, and will also be useful while learning how to use the program. _3._1. _H_a_r_d_w_a_r_e _R_e_q_u_i_r_e_m_e_n_t_s J To run the genetic algorithm Workbench, you require J IBM PC/XT/AT or compatible EGA display Microsoft compatible mouse J J The Workbench has been tested on MS-DOS version 3.3 but should work on most versions of MS-DOS and PC-DOS. Reports of problems with other versions will be appreciated. J The program will work on a VGA mode screen, but unless you can put it in EGA emulation mode, the program will only use the lower two thirds of the screen, giving a squashed display. _3._2. _R_u_n_n_i_n_g _t_h_e _P_r_o_g_r_a_m J Check that your system hardware is suitable for running the genetic algorithm workbench (see above). J Switch on the computer and wait for your MS-DOS prompt to appear on the screen. If you have a display adapter board that can emulate several display modes, ensure that it is in EGA mode. This might be J G3 Hdone by setting configuration switches on the adapter card prior to switching your computer on, or by means of a special command provid- ed with your display adapter. Refer to your display adapter manual for details of how to set it to EGA mode. J Now, ensure that your mouse driver is loaded. On some systems a command such as "MOUSE" is provided to load the driver, on others a command must be inserted into your config.sys file. Refer to the documentation for your mouse on how to install the mouse driver. J Now the final test. Insert the Genetic Algorithm Workbench disk into drive a: and type J a:gaw J J and press the <ENTER> key. J The program takes a little while to load but you should see a display similar to that shown in figure 1 (see next section), and if you move your mouse, the mouse cursor should be visible and will change as you move it over different areas of the screen. J If you don't think everything is as described here, refer to appen- dix C which describes possible problems and how to deal with them. _3._3. _S_c_r_e_e_n _D_i_s_p_l_a_y J Figure 1 shows a screen display similar to the one you should see when the Workbench is first started. The main features of the display are as follows. J _C_o_m_m_a_n_d _M_e_n_u This is a menu of commands shown at the top left of the screen which are activated using the mouse. Each command is highlighted when the mouse cursor overlays it, and is executed if the left mouse button is pressed while the command is highlighted. See section 3.5, Program Control for details of these commands. J _A_l_g_o_r_i_t_h_m _C_o_n_t_r_o_l _C_h_a_p_t_e_r This is the large box which spans the top of the screen to the right of the command menu. It is called a _c_h_a_p_t_e_r because it can contain several _p_a_g_e_s, only one of which is visible at one time. Initially, it displays a page called "Simple Genetic Al- gorithm" which shows a number of input variables and their values, which are used to control the operation of this algo- J G4 Hrithm. Pages can be flipped through, forwards or backwards, by clicking the left mouse button on the arrows in the top right hand corner of the chapter. Each page is described below. J _S_i_m_p_l_e _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m _P_a_g_e This is the box within the algorithm control chapter entitled "Simple Genetic Algorithm". (Note that it may be necessary to select this page using the mouse before it is displayed within the chapter box - see "Algorithm Control Chapter" above.) This page is normally displayed when the Workbench is first started and lists a number of control inputs to the genetic algorithm together with their current values. The page shows two columns of input variables. Each variable displayed shows its name to the left, a pair of arrows in the middle, and the variable's current value to the right. Note that many of the variable values are text strings. You can alter the value of any of these variables by clicking the left mouse button on the up or down arrows to the left of each value. The meaning of each variable is described in appendix A. J _G_e_n_e_r_a_l _P_r_o_g_r_a_m _C_o_n_t_r_o_l _V_a_r_i_a_b_l_e_s _P_a_g_e This is the box within the algorithm control chapter entitled "General Program Control Variables". (Note that it may be necessary to select this page using the mouse before it is displayed within the chapter box - see "Algorithm Control Chapter" above.) This page contains variables related to gen- eral program operation rather than a specific algorithm. The meaning of each program control variable is described in sec- tion 3.5. J _O_u_t_p_u_t _V_a_r_i_a_b_l_e_s _B_o_x This is the box at the bottom right of the screen which con- tains the current values of a number of variables relating to the current algorithm. These variables cannot be altered by the user; they indicate the current state of the algorithm. Each output variable is described in appendix A. J _T_a_r_g_e_t _F_u_n_c_t_i_o_n _G_r_a_p_h This is the graph labelled "Target Function" and is used for entry and display of the function to be solved. Using the Workbench involves drawing functions on this graph which are then solved using the selected algorithm. After a function has been provided, a small red triangle on the x axis marks the highest peak, which is the target for the algorithm to find. J _P_o_p_u_l_a_t_i_o_n _D_i_s_t_r_i_b_u_t_i_o_n _H_i_s_t_o_g_r_a_m This is the plot entitled "Population Distribution" immediately below the target function graph and shows the distribution of organisms by value of x for the genetic algorithm. G5 H_O_u_t_p_u_t _G_r_a_p_h This is the graph labelled "Output Plot" and is used to display plots of various output variables against time. J _A_x_i_s _V_a_l_u_e _B_o_x This box labelled axis value is used in combination with the mouse cursor to read values from any of the graphs described above. When the mouse cursor is moved over the plot area of any graph, it changes to a cross hair and causes the Axis Value box to display the coordinate values of the corresponding graph at the point indicated by the cross hair. J _F_i_g_u_r_e _1 - _E_x_a_m_p_l_e _S_c_r_e_e_n _D_i_s_p_l_a_y J _3._4. _M_e_n_u _C_o_m_m_a_n_d_s J The program is controlled entirely through use of a mouse. General com- mands such as starting and stopping the algorithm are invoked through a menu. The target function is input by drawing the function on a graph using the mouse cursor, and algorithm and program control variables can be altered by clicking the mouse over the up and down arrows to the left of each value. J This section describes each menu command. Program control variables are explained in the following section. J The functions of the command menu shown in the top left of the screen are as follows: JJ G6 H_R_e_d_r_a_w Redraws the whole screen. J _S_t_a_r_t _A_l_g/_S_t_o_p _A_l_g Start/pause algorithm operation. Note that an algorithm can only be run if it the algorithm chapter is showing the corresponding al- gorithm page. No algorithm can run while the chapter is displaying the page relating to general program control variables. J _S_t_e_p _A_l_g No function. Currently unimplemented. J _R_e_s_e_t _A_l_g Resets algorithm ready for a run. See the relevant appendix for details of reseting an algorithm. J _P_l_o_t _D_a_t_a Plots currently selected data (see description of "Plot data" vari- able later), on the output plot. This displays the selected vari- able against time for the duration of the current algorithm run. J _P_l_o_t _T_a_r_g Re-plot the target function. After redrawing the entire screen the target function graph is cleared. This command allows the current target function to be redrawn, but note that it will have no effect until a function has been entered using the "Enter Targ" command described next. J _E_n_t_e_r _T_a_r_g Enter or re-enter target function. After executing this command, the mouse cursor is moved to the target function graph allowing a target function to be drawn. Clicking with the left mouse button plots a point for the function, and clicking with the right deletes a point. You should draw a function which spans the full width of the x axis from left to right and uses as few points as possible. Functions with many points slow down the algorithms. When you are happy with the function, press both left and right mouse buttons together. J _Q_u_i_t Exit the program. J _T_e_s_t Tests my jump-up menus (no function). Play by all means, but this has nothing to do with the Workbench program. G7 H_3._5. _P_r_o_g_r_a_m _C_o_n_t_r_o_l _V_a_r_i_a_b_l_e_s J The program control variables are shown on the page labelled "General Program Control Variables". The meaning of each program control vari- able explained below. (Algorithm control variables are explained in ap- pendix A.) J Program control variable meanings: J _P_l_o_t _d_a_t_a This variable is used to determine the source of data for plotting on the graph entitled "Output Plot". The selections available in- clude values (but not all values) of output variables from those displayed in the "Output Variables" box. J _O/_P _P_l_o_t _X-_m_a_x This variable sets scale of the output plot X axis by fixing the maximum x value that can be displayed. J _O/_P _P_l_o_t _Y-_m_a_x This variable sets scale of the output plot Y axis by fixing the maximum y value that can be displayed. J _P_l_o_t _p_e_r_i_o_d This variable determines the frequency with which the population distribution histogram is updated. A value of 1 causes an update for every iteration (or generation) of the algorithm. A value of 2 causes update every other iteration, 3 every third and so on. J _R_a_n_d_o_m # _s_e_e_d This value is used to seed the program's random number generator each time the algorithm is reset from the command menu. _4. _B_i_b_l_i_o_g_r_a_p_h_y J This section lists some sources of information about genetic algorithms. J A brief and very general introduction to genetic algorithms is given in appendix D which contains a copy of an article from The Guardian newspa- per. J The following text is a comprehensive textbook of genetic algorithm theory and applications up to the year 1989. JJ G8 HGoldberg, D. E. (1989). _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m_s _i_n _S_e_a_r_c_h _O_p_t_i_m_i_z_a_t_i_o_n & _M_a_c_h_i_n_e _L_e_a_r_n_i_n_g. Addison-Wesley. J Users wishing to experiment with genetic algorithms in their own pro- grams may be interested in a package called GENESIS. This is a set of very useful subroutines written in C and built into a tool for experi- menting with different "flavours" of algorithm. GENESIS is far more flexible than the Workbench but is not interactive and has no graphical output built in. As an integrated tool, GENESIS will run on most Unix systems, but the genetic algorithm subroutines are easily ported to any system with a standard C compiler. GENESIS can be obtained from its au- thor: J John J. Grefenstette, Navy Center for Applied Research in Artificial Intelligence, Naval Research Laboratory, Washington, D.C. 20375-5000. USA. J J The author of the Workbench program is happy to correspond with users interested in learning more about genetic algorithms, or who wish to discuss their relevance to a particular kind of problem. He can be reached by writing to the following address: J Mark Hughes, 256 Milton Road, Cambridge, CB4 1LQ. UK. Internet: mrh@camcon.co.uk J G9 H_5. _A_p_p_e_n_d_i_x _A - _W_o_r_k_b_e_n_c_h _A_l_g_o_r_i_t_h_m_s J This appendix describes the implementation of the genetic algorithm and operators used in the Workbench program. _5._1. _S_o_l_v_i_n_g _P_r_o_b_l_e_m_s _w_i_t_h _a _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m J A genetic algorithm evolves solutions to a problem through natural selection, breeding and genetic variation. This involves generating a population of solutions, measuring their suitability or fitness, selecting the better solutions for breeding which produces a new po- pulation. The process is repeated and gradually the population evolves towards the solution. J In the genetic algorithm Workbench, the problem is to find the value of x for which the target function has a maximum value of f(x). Each individual in the population represents a solution to this problem in the form of a candidate value for x. The suitability or fitness of the individual is simply taken by calculating the value of f(x) for the individual. This leads to an individual whose value of x corresponds to a high value of f(x) being fitter, and conse- quently being given a greater chance of breeding, than an individual whose value of x corresponds to a lower value of f(x). _5._2. _G_e_n_o_t_y_p_e _C_o_d_i_n_g J In the same way that the genetic information of animals is coded as a string (of DNA), the genetic information of each individual, i.e. its value of x, is also coded as a string. In this case as a string of zeros or ones which can be interpreted as a simple binary code. The choice of a string representation is deliberate because it al- lows processes which act on strings of DNA during natural evolution to be implemented by the computer version of the genetic algorithm. J The string coding of each individual is known as its genotype in biological terminology, while its interpretation, i.e. its value of x, is referred to as its phenotype. _5._3. _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m _I_m_p_l_e_m_e_n_t_a_t_i_o_n J The genetic algorithm implemented by this program boils down to the following steps. (Note that a number of new terms, shown in ital- ics, are introduced during the following steps without full explana- tion. These terms refer to operations that are explained subse- quently.) G10 H(1) Generate an initial population of organisms. The random number generator is seeded with the value of "Random # seed" (see sec- tion 3.5), and a new population of organisms is generated each with a different random genotype. This happens whenever the "Reset Alg" command is invoked from the main menu. The command also resets the generation counter to 0 and clears the output variables which are updated during each algorithm run. The number of organisms generated depends on the value of the _p_o_p_u_- _l_a_t_i_o_n input variable. J (2) Calculate and scale new fitnesses. Each new individual's fit- ness is calculated by reading a value of f(x) from the target function at the individual's value of x. If selected, _f_i_t_n_e_s_s _s_c_a_l_i_n_g is now done. J (3) Select individuals for breeding. A subset of the population is selected for breeding. J (4) Breed to produce new population. The set of breeders are taken in random pairs and mated to produce pairs of new organisms, the progeny. J (5) Disperse progeny into the population. The new progeny are in- serted into the population, displacing existing individuals. J After the last step, the algorithm begins again at step 2, starting the next generation. _5._4. _S_u_m_m_a_r_y _o_f _A_l_g_o_r_i_t_h_m _I_n_p_u_t _V_a_r_i_a_b_l_e_s J Each input variable to the simple genetic algorithm is summarised below. _5._4._1. _P_o_p_u_l_a_t_i_o_n J This input variable determines the number of individuals in the po- pulation. That is, the number of candidate solutions being manipu- lated by the algorithm at any time. Too small a population and there will be little opportunity for genetic variation, too large and the algorithm will reduce to a random search. J For the problem posed in the Workbench, populations as low as 5 to 10 can be quite effective but in more complex problems where there are many more possible solutions larger populations are required. However, even for very complex problems, populations rarely exceed a few hundred individuals. J G11 HOne of the reasons why surprisingly small populations can be effec- tive is the intrinsic tolerance of noise exhibited by the genetic algorithm which arises out of the repeated sampling of small chunks of the genetic material (so called schemata) over a number of gen- erations. J There is a discussion of the issues in selecting suitable population sizes in Goldberg 1988. _5._4._2. _F_i_t_n_e_s_s _S_c_a_l_i_n_g J Fitness scaling occurs between the production of new individuals in the population and the use of their fitness values for selection. It is a method of adjusting the probability distribution which determines the likelihood of an individual being selected for breed- ing. It is usually used to emphasise the relatively small differ- ences in relative fitness when a population begins to converge. Without it, the rate of convergence will slow down as diversity de- creases and most individuals in the population have similar fitnesses. J The kind of fitness scaling employed depends on the value of the corresponding input variable which has one of the following values. J _N_o_n_e No scaling. An individual's scaled fitness value is the same as its unscaled value. J _L_i_n_e_a_r Each individual's scaled fitness f' is calculated from its uns- caled fitness f according to the formula J f' = a.f + b J J where a and b are chosen so that the mean scaled fitness is equal to the mean unscaled fitness of the population, and so that the maximum scaled fitness is a given multiple of the max- imum unscaled fitness. The multiple is typically two. J Several methods of fitness scaling are discussed in Goldberg 1989. _5._4._3. _B_r_e_e_d_e_r _S_e_l_e_c_t_i_o_n J Breeder selection involves choosing a number of individuals accord- ing to (scaled) fitness which will be used for breeding. G12 HThe number chosen depends on the number of new individuals required, which is the product of the current population size and the _g_e_n_e_r_a_- _t_i_o_n _g_a_p. The latter is an input variable between 0 and 1 which represents the proportion of the current population replaced during each generation. J The method of selecting the individuals depends on the value of the input variable _b_r_e_e_d_e_r _s_e_l_e_c_t_i_o_n: J _R_o_u_l_e_t_t_e This method is so named because of its similarity to spinning a roulette wheel. In effect, an imaginary roulette wheel is marked out with one slot per individual in the population, but the slots are of differing sizes giving some individuals a better chance of being selected for breeding than others. By making the slot size proportional to the (scaled) fitness of each individual, the fitter individuals have a correspondingly better chance of being selected and passing on their charac- teristics. J The imaginary wheel is spun once for each individual required, one individual being selected per spin. This allows some indi- viduals to be selected more than once and others not to be selected at all. J _E_x_p_e_c_t_e_d _V_a_l_u_e There is a potential problem with roulette wheel selection be- cause it is a stochastic process. In other words, its random element allows some individuals to be selected more often than their fitness deserves (and others to be selected less often). Expected value selection reduces this stochastic error by en- suring that no individual can be selected more than one more time than it deserves. (Obviously some stochastic error must remain because the number of times and individual is selected must be an integer whereas its "selection merit" is a real number). J Both _g_e_n_e_r_a_t_i_o_n _g_a_p and several kinds of _b_r_e_e_d_e_r _s_e_l_e_c_t_i_o_n are dis- cussed in Goldberg 1989. _5._4._4. _G_e_n_e_r_a_t_i_o_n _G_a_p J The _g_e_n_e_r_a_t_i_o_n _g_a_p input variable determines the proportion of each the population replaced during each generation. See breeder selec- tion above. G13 H_5._4._5. _M_a_t_e_s _S_e_l_e_c_t_i_o_n J Following selection of a pool of individuals for breeding, pairs are taken from this pool and bred to produce a pool of progeny. The in- put variable _m_a_t_e_s _s_e_l_e_c_t_i_o_n determines how these pairs are chosen. Currently only one method is supported: J _R_a_n_d_o_m Each individual chosen for mating from the breeding pool is selected at random and is immediately removed from the pool to prevent it being selected again. J In this implementation all individuals are identical and so purely random selection of a mate is always valid. However, more compli- cated schemes are feasible where mating is restricted in some way, perhaps to simulate the formation of niche populations or species. This is discussed further in Goldberg 1989 (p188-197). See also section 5.4.9 which describes dispersal by crowding. _5._4._6. _M_a_t_i_n_g J Having selected a pair of individuals for mating, they are mated to produce new individuals which are collected in a pool of progeny. The method used is determined by the value of the _m_a_t_i_n_g input vari- able, but this currently only supports one method: J _S_i_m_p_l_e Simple mating produces two progeny from two parents as follows. First a copy of each parent is taken and _c_r_o_s_s_o_v_e_r is applied to produce two individuals each of which receives some genetic material from both parents. Finally, _m_u_t_a_t_i_o_n is applied to each individual which may introduce a random change to the genetic material. J Crossover and mutation are described in the following sections. J Currently only simple mating is supported, but many variations can be envisaged, for example incorporating other genetic operators than crossover and mutation. These operators are copied directly from natural processes and their are many other such operators, both na- tural and man made, that can be tried. _5._4._7. _C_r_o_s_s_o_v_e_r J As mentioned earlier, each individual represents a candidate solu- tion to the problem in the form of a string of zeros and ones. J G14 HCrossover is a process which takes two such strings and exchanges portions of the strings to produce two new strings, each of which incorporates information from both the initial strings. Many varia- tions of the crossover operator have been experimented with. The following are available according to the value of the _c_r_o_s_s_o_v_e_r in- put variable: J _S_i_n_g_l_e _P_o_i_n_t A point is chosen at random from the first to the last but one binary digit of one of the strings. The digits following this point are exchanged between the two strings. For example, given the two initial strings and a crossover point indicated by the '^'. J 0 1 0 0 0 0 0 1 0 1 1 0 1 0 ^ J J the following new strings would be produced. J 0 1 0 1 0 1 0 1 0 1 0 0 0 0 ^ J J _T_w_o _P_o_i_n_t Two point crossover is similar except that two distinct points are chosen, again randomly, and the segment between the two points is exchanged. The operator works in such a way that single point crossover is a legal special case of two point crossover. This is the case if either of the two points is at the extremes of the string. J _U_n_i_f_o_r_m Uniform crossover passes along the length of the strings and chooses to take each bit from either one parent or the other ac- cording to some specified probability. The origin of each bit is independent of all the other bits and in this implementation has an equal chance of being taken from either parent. This produces more mixing of bits than the former operators. J The effect of crossover, whatever its form, is to produce new individu- als which contain genetic material from two parents. No new material is introduced, and so there is a limit to the genotypes that can be pro- duced throughout crossover alone. J Several such operators including single point and two point crossover are described in Goldberg 1989. Uniform crossover is described (and J G15 Hcompared with several other crossover operators) in Syswerda 1989. _5._4._8. _M_u_t_a_t_i_o_n _P_r_o_b_a_b_i_l_i_t_y J Mutation involves the chance introduction of a change to any particular bit of an individual's string. Each bit is considered in turn, and is flipped from a zero to a one (or vice versa) with probability determined by the value of the _m_u_t_a_t_i_o_n _p_r_o_b_a_b_i_l_i_t_y input variable. J Mutation can result in new genetic material being introduced into the population, since it can produce values that were not present in either parent, or indeed in the entire population. J Typical mutation rates are of the order one in a hundred to one in a thousand bits. Much higher rates tend to disrupt the action of cross- over, preventing convergence and leading to a more random type of search. _5._4._9. _D_i_s_p_e_r_s_a_l J Dispersal is the method by which progeny are placed into the existing population, which requires some existing individuals to be deleted. This is done by the method indicated by the value of the _d_i_s_p_e_r_s_a_l input variable: J _R_a_n_d_o_m Individuals are chosen at random from the existing population to be replaced by progeny until all progeny have been inserted. Measures are taken to ensure that progeny inserted by the current dispersal are not displaced by the insertion of other progeny. J Crowding Crowding is a mechanism used to prevent premature convergence of the algorithm. The chance of an individual being displaced is made to depend on the degree of similarity between it and the new indi- vidual that is replacing it. When a new individual is ready to be placed into the existing population, it is compared (bit for bit) with a given number of individuals chosen at random from the exist- ing population. The one most like the new individual is chosen to be replaced. The number of individuals chosen for comparison is determined by the value of the _c_r_o_w_d_i_n_g _f_a_c_t_o_r input variable. J Dispersal by crowding is so called because it simulates competition for scarce resources by individuals which are genetically similar and so encourages the formation of niche populations. Mate selec- tion, described in section 5.4.5 is another method that can be used G16 Hto encourage the formation of niche populations. These and other methods are discussed further in Goldberg 1989 (p188-197). _5._4._1_0. _C_r_o_w_d_i_n_g _F_a_c_t_o_r J The _c_r_o_w_d_i_n_g _f_a_c_t_o_r input variable determines the number of individuals that are compared with each new individual during during dispersal by crowding (see above). A crowding factor of 1 would prevent crowding from working and be equivalent to random dispersal. _5._4._1_1. _E_l_i_t_i_s_m J Elitism is a feature that is active dependent on the value (on or off) of the _e_l_i_t_i_s_m input variable. J When active, elitism ensures that the best, or fittest, individual can- not be lost from the population through dispersal. A copy of the fit- test individual in each generation is kept and is re-introduced into the population if, after dispersal, no individual in the new population is at least as fit. J When elitism acts to re-introduce a lost individual it must choose an individual to be replaced. For details of how this is done, see the section entitled "Sacrifice Selection" below. J Elitism is discussed in Goldberg 1989. _5._4._1_2. _S_a_c_r_i_f_i_c_e _S_e_l_e_c_t_i_o_n J The method by which an individual is selected for replacement due to the operation of elitism (see above) is determined by the value of the input variable _s_a_c_r_i_f_i_c_e _s_e_l_e_c_t_i_o_n as follows: J _R_a_n_d_o_m The individual to be replaced is chosen randomly. J _W_e_a_k_e_s_t The weakest (i.e. least fit) individual is chosen. _5._5. _O_u_t_p_u_t _v_a_r_i_a_b_l_e_s J With each generation the display of output variables is updated. Each variable is explained below: G17 H_G_e_n_e_r_a_t_i_o_n The number of generations (iterations of the genetic algorithm) completed so far. J _O_p_t_i_m_u_m _F_i_t_n_e_s_s The maximum value of the function f(x). J _C_u_r_r_e_n_t _B_e_s_t _F_i_t_n_e_s_s The highest value of f(x) for any individual in the current popula- tion. J _A_v_e_r_a_g_e _F_i_t_n_e_s_s The average value of f(x) for all individuals in the current popu- lation. J Note that the above fitness values relate to unscaled fitnesses. J _O_p_t_i_m_u_m _x The value of x which corresponds the the maximum value of f(x) of the target function. J _C_u_r_r_e_n_t _B_e_s_t _x The x of the individual in the population which has the highest value for f(x). J _A_v_e_r_a_g_e _x The average of all x values in the current population. _5._6. _R_e_f_e_r_e_n_c_e_s J Goldberg 1988 Goldberg, D. E. (1988). _S_i_z_i_n_g _P_o_p_u_l_a_t_i_o_n_s _f_o_r _S_e_r_i_a_l _a_n_d _P_a_r_a_l_l_e_l _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m_s. TCGA report no. 88005. University of Alabama. J Goldberg 1989 Goldberg, D. E. (1989). _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m_s _i_n _S_e_a_r_c_h _O_p_t_i_m_i_z_a_t_i_o_n & _M_a_c_h_i_n_e _L_e_a_r_n_i_n_g. Addison-Wesley. J Syswerda 1989 Syswerda, G. (1989). _U_n_i_f_o_r_m _C_r_o_s_s_o_v_e_r _i_n _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m_s. Proceedings of the Third International Conference on Genetic Algo- rithms. Morgan Kaufman. G18 H_6. _A_p_p_e_n_d_i_x _B - _A _T_y_p_i_c_a_l _S_e_s_s_i_o_n _U_s_i_n_g _t_h_e _W_o_r_k_b_e_n_c_h J This appendix describes a typical session using the genetic algorithm Workbench. J To run the Workbench program you require an IBM compatible personal com- puter with EGA display and Microsoft compatible mouse. To start the program, make sure your mouse driver is loaded, your display is in EGA mode and type J gaw J J followed by pressing the <ENTER> key. The program does not have any command line parameters. (See also section 3.2, "Running the Pro- gram".) J Check that the program display is similar to that of figure 1 shown earlier in the manual. If you think something is wrong, refer to _p_r_o_b_l_e_m_s, _a_p_p_e_n_d_i_x _C _w_h_i_c_h _d_e_s_c_r_i_b_e_s _p_o_s_s_i_b_l_e _p_r_o_b_l_e_m_s _a_n_d _t_h_e_i_r _s_o_l_u_t_i_o_n. J Typical use involves the following steps. J 1 Click on "Enter Targ" in the menu and enter a function extend- ing from leftmost to rightmost points on the graph. (Note that clicking the left mouse button sets a point, clicking the right deletes a point, and pressing both together ends input of the function). You should enter a function starting from the graph origin (bottom left), and finishing at the bottom right of the graph. To end entry of the function, press left and right mouse buttons simultaneously. If you go wrong, click on "Enter Targ" and try again. J 2 Click on "Reset Alg" to initialise the genetic algorithm. No- tice the histogram of population distribution in the bottom left hand corner is redrawn, along with the output variables in the box at the bottom right corner of the screen. J 3 Click on "Start Alg" which causes the algorithm to run. Note that the menu option changes its name to "Stop Alg", and so clicking on it again will pause the algorithm. J While running, a count of generations is maintained in the "Output Variables" box, along with several other algorithm variables. Every so often, the histogram is updated as the population attempts to converge on the value of x which corresponds to the highest peak in G19 Hthe target function input earlier. J At any time, a plot of average fitness against time can be produced by clicking on "Plot Data". The algorithm can be paused/re-started by clicking on "Stop Alg" and "Start Alg". New target functions can be provided, as described earlier, and the algorithm allowed to run with the current population distribution or with a new distribution by clicking on "Reset Alg". J While the algorithm is paused, various program variables can be changed by clicking the mouse button while the cursor is hovering over the _u_p and _d_o_w_n arrows next to values displayed in the large box spanning the top of the screen. Pressing and holding the mouse button enables variables to be changed quickly. J Note that the large box spanning the top of the screen contains two pages, only one of which is displayed at a time. These can be thumbed through by clicking on the arrows at the very top right of the display, immediately to the right of the copyright message. The first page is called "Simple Genetic Algorithm" which allows algo- rithm variables to be adjusted. The second page, called "General Program Control Variables", allows selection of different data for plotting on the output graph, changes to the scale of output graph axes and so on. J Note that the algorithm will only run while the "Simple Genetic Al- gorithm" page is selected. Alternative algorithms, simulated an- nealing for example, will be implemented by providing additional pages of algorithm variables. G20 H_7. _A_p_p_e_n_d_i_x _C - _P_r_o_b_l_e_m_s _a_n_d _H_o_w _t_o _F_i_x _T_h_e_m J This appendix is intended to help sort out problems encountered when trying to run the genetic algorithm Workbench program. J If you are unable to fix any problems using the list of potential problems and solutions which follow, it is wise to take the follow- ing steps before giving up in despair: J (1) Prevent installation of any unnecessary resident programs such as pop-up utilities, disk caches or command line editors. These are often installed by commands in your autoexec.bat file and could be the source of your problem. J (2) Prevent installation of any unnecessary device drivers. These are usually installed by commands in your config.sys file such as "device=filename". J The only resident program or device driver that you will need in- stalled is your mouse driver which may be installed by either of the above methods depending on the software supplied with your mouse. _P_r_o_b_l_e_m: _D_i_s_p_l_a_y _i_s _s_q_u_a_s_h_e_d J The top third of the screen is blank, but everything is displayed correctly in the lower two thirds of the screen. The mouse cursor may also be missing. J Your display adapter is in the wrong mode, possibly VGA mode. Refer to your display adapter manual for details of how to set it into EGA mode. _P_r_o_b_l_e_m: _D_i_s_p_l_a_y _c_o_r_r_u_p_t _o_r _b_l_a_n_k J Your display adapter is probably in the wrong mode. Refer to your display adapter manual for details of how to set it into EGA mode. _P_r_o_b_l_e_m: _N_o _m_o_u_s_e _c_u_r_s_o_r J The mouse cursor is not visible but it is possible to highlight op- tions in the command menu by moving the invisible cursor towards the menu and "circling" with the mouse. J G21 HThis probably indicates a problem with your mouse driver. It may not be compatible with this mode of your display adapter. Note that the Workbench program operates in _g_r_a_p_h_i_c_s mode, and so this problem may occur even if you have used your mouse with the display in EGA _c_h_a_r_a_c_t_e_r mode (which would display a block mouse cursor). G22 H_8. _A_p_p_e_n_d_i_x _D - _G_e_n_e_r_a_l _I_n_t_r_o_d_u_c_t_i_o_n _t_o _G_e_n_e_t_i_c _A_l_g_o_r_i_t_h_m_s J The following is an article which appeared in the Guardian newspaper on 14 September 1989. _W_h_y _N_a_t_u_r_e _K_n_o_w_s _B_e_s_t _A_b_o_u_t _D_e_s_i_g_n J _E_n_g_i_n_e_e_r_s _n_o_w _n_e_e_d _a _h_e_l_p_i_n_g _h_a_n_d _f_r_o_m _N_a_t_u_r_e _i_n _o_r_d_e_r _t_o _s_o_l_v_e _t_h_e_i_r _p_r_o_b_l_e_m_s. _M_a_r_k _H_u_g_h_e_s _r_e_p_o_r_t_s. J All life on earth, including its most intricate and ingenious features is the product of a genetic algorithm, known more commonly as evolution. However, genetic algorithms need not be confined to nature. They can be used to help solve many design and optimisation problems. Computer implementations of genetic algorithms are being used to tackle difficult problems in fields as far ranging as tur- bine blade design, automatic integrated circuit layout, and even in the training of neural networks. J All living things carry a kind of "blue-print" for their construc- tion in the DNA of each living cell. Over a period of time, changes (e.g. mutations) occur to the DNA giving rise to organisms which are more likely to survive, and so have a greater chance of passing their improved characteristics on to future generations. Of course, not all changes will be beneficial, but those which are not tend to die out. This is evolution. It is analogous to engineers making design changes in order to improve their company's product and so gain a competitive advantage or increase profitability. The genetic algorithm is the mechanism invented by nature for trying out altera- tions to DNA. J In the past, evolution has been perceived as a slow and hap-hazard process, relying on random mutations, and thus unsuitable for use in engineering. This perception caused problems to scientists trying to explain the rapid rate of evolution evident from the fossil record, and has been shown to be false. Although scientists have been aware of genetic operators other than random mutation for some time, it was not until the 1970s that a mathematical analysis re- vealed their importance (Holland 1975). Holland showed that nature's genetic algorithm was highly efficient at search and optimisation, and in no sense hap-hazard. J In engineering, computer simulations of genetic algorithms can be used to evolve better designs for a variety of systems. The comput- er is used to maintain a population of competing designs each with its own computer representation of DNA (usually a binary string). Here, instead of determining animal characteristics such as size, number of limbs and eye colour, the "DNA" is decoded to produce J G23 Hdesign characteristics such as lengths, angles, equations or rules. At each generation, the better (cf. fitter) designs are chosen to reproduce, and operators borrowed from nature are used to make changes to their "DNA" in the search for improvements. The designs are then tested by decoding the "DNA", and over several generations the population evolves and improves the criteria chosen by the en- gineer. J The freedom to select fitness criteria allows genetic algorithms to be applied in many fields. For example, you may wish to evolve rules for trading in financial markets, improve the aerodynamics of a vehicle, or simply solve an abstract mathematical function. But why use a genetic algorithm, when in the last case for example, there are plenty of established methods for solving mathematical functions? J The reason is that existing methods are fine so long as the problem is not too complex. A genetic algorithm allows extremely difficult functions to be solved efficiently - even the design of a living or- ganism. J In engineering terms, the strengths of genetic algorithms can be summarised by their abilities to cope with a variety of very diffi- cult problems, to work without prior knowledge about the function being optimised, to optimise "noisy" functions, and to do without secondary information such as gradients. In plain language they can cope with the difficulties represented by real-life problems which are generally insoluble by other methods. J A recent and most impressive testament to the fact that genetic al- gorithms are now coming of age has been provided by General Electric in the USA who used the technique to design an improved gas turbine blade. Their computer model indicates efficiency improvements of 2% for the design, a significant saving in this field, and they are currently spending around $1m verifying the prediction. If this succeeds they will spend $70m re-tooling their production line to produce the new type of blade. J Applications in addition to those mentioned already include gas pipeline management, medical image registration, adaptive filter design and mechanical structure optimisation. But engineering is not the only area to benefit. Genetic algorithms have been used to generate rules for financial trading systems, to classify forensic evidence, and to identify insurance risks. J A lack of computer power has been a factor limiting the usefulness of genetic algorithms as they sometimes require large amounts of computation, particularly if complex computer models are involved. But this is no longer such a problem because computing power has be- J G24 Hcome relatively cheap. Even very difficult problems can now be solved because of the efficiency with which genetic algorithms can be implemented on today's parallel computer architectures. J Engineering problems are getting so complex that man's intuitive ap- proach to design is becoming too primitive. Genetic algorithms are one of the tools that engineers are using to make up for their short-comings. J _R_e_f_e_r_e_n_c_e: _H_o_l_l_a_n_d _J. _H. (_1_9_7_5). _A_d_a_p_t_a_t_i_o_n _i_n _N_a_t_u_r_a_l _a_n_d _A_r_t_i_f_i_- _c_i_a_l _S_y_s_t_e_m_s. _U_n_i_v. _o_f _M_i_c_h_i_g_a_n _P_r_e_s_s: _A_n_n _A_r_b_o_r, _M_I. G25 H_9. _A_p_p_e_n_d_i_x _E - _M_a_i_n _C_o_m_m_a_n_d _M_e_n_u J The functions of the command menu shown in the top left of the screen are as follows. J GOption Function HRedraw Redraws the whole screen Start Alg/Stop Alg Start/pause algorithm operation Step Alg No function Reset Alg Generate initial random population Plot Data Plot graph of average or best fitness Plot Targ Re-plot the target function Enter Targ Enter or re-enter target function Quit Exit the program Test Tests my jump-up menus (no function) J J G26 @